AITopics | aerial imagery

Collaborating Authors

aerial imagery

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

HALO: High-Altitude Language-Conditioned Monocular Aerial Exploration and Navigation

Tao, Yuezhan, Ong, Dexter, Cladera, Fernando, Hughes, Jason, Taylor, Camillo J., Chaudhari, Pratik, Kumar, Vijay

arXiv.org Artificial IntelligenceNov-24-2025

Abstract-- We demonstrate real-time high-altitude aerial metric-semantic mapping and exploration using a monocular camera paired with a global positioning system (GPS) and an inertial measurement unit (IMU). Our system, named HALO, addresses two key challenges: (i) real-time dense 3D reconstruction using vision at large distances, and (ii) mapping and exploration of large-scale outdoor environments with accurate scene geometry and semantics. We demonstrate that HALO can plan informative paths that exploit this information to complete missions with multiple tasks specified in natural language. We use real-world experiments on a custom quadrotor platform to demonstrate that (i) all modules can run onboard the robot, and that (ii) in diverse environments HALO can support effective autonomous execution of missions covering up to 24,600 sq. Experiment videos and more details can be found on our project page: https://tyuezhan.github. Aerial robots operating at high altitudes have a large effective field-of-view, this can be used very effectively for mapping and exploration. However, high-altitude aerial operations present some unusual challenges in perception. For example, consumer-grade LiDARs provide accurate depth but the point density at large distances is low. LiDARs are also expensive, heavy and do not provide the same richness of information as cameras. Vision-based systems are also more attractive because they are inexpensive and lightweight.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.17497

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

AI Powered Urban Green Infrastructure Assessment Through Aerial Imagery of an Industrial Township

Dutta, Anisha

arXiv.org Artificial IntelligenceOct-28-2025

Accurate assessment of urban canopy coverage is crucial for informed urban planning, effective environmental monitoring, and mitigating the impacts of climate change. Traditional practices often face limitations due to inadequate technical requirements, difficulties in scaling and data processing, and the lack of specialized expertise. This study presents an efficient approach for estimating green canopy coverage using artificial intelligence, specifically computer vision techniques, applied to aerial imageries. Our proposed methodology utilizes object-based image analysis, based on deep learning algorithms to accurately identify and segment green canopies from high-resolution drone images. This approach allows the user for detailed analysis of urban vegetation, capturing variations in canopy density and understanding spatial distribution. To overcome the computational challenges associated with processing large datasets, it was implemented over a cloud platform utilizing high-performance processors. This infrastructure efficiently manages space complexity and ensures affordable latency, enabling the rapid analysis of vast amounts of drone imageries. Our results demonstrate the effectiveness of this approach in accurately estimating canopy coverage at the city scale, providing valuable insights for urban forestry management of an industrial township. The resultant data generated by this method can be used to optimize tree plantation and assess the carbon sequestration potential of urban forests. By integrating these insights into sustainable urban planning, we can foster more resilient urban environments, contributing to a greener and healthier future.

artificial intelligence, canopy coverage, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2510.21876

Genre: Research Report > New Finding (0.54)

Industry: Information Technology > Services (0.49)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mapping Farmed Landscapes from Remote Sensing

Conserva, Michelangelo, Wilson, Alex, Stanton, Charlotte, Batchu, Vishal, Gulshan, Varun

arXiv.org Artificial IntelligenceOct-17-2025

Effective management of agricultural landscapes is critical for meeting global biodiversity targets, but efforts are hampered by the absence of detailed, large-scale ecological maps. To address this, we introduce Farmscapes, the first large-scale (covering most of England), high-resolution (25cm) map of rural landscape features, including ecologically vital elements like hedgerows, woodlands, and stone walls. This map was generated using a deep learning segmentation model trained on a novel, dataset of 942 manually annotated tiles derived from aerial imagery. Our model accurately identifies key habitats, achieving high f1-scores for woodland (96\%) and farmed land (95\%), and demonstrates strong capability in segmenting linear features, with an F1-score of 72\% for hedgerows. By releasing the England-wide map on Google Earth Engine, we provide a powerful, open-access tool for ecologists and policymakers. This work enables data-driven planning for habitat restoration, supports the monitoring of initiatives like the EU Biodiversity Strategy, and lays the foundation for advanced analysis of landscape connectivity.

artificial intelligence, machine learning, stone wall, (17 more...)

arXiv.org Artificial Intelligence

2506.13993

Country: Europe > United Kingdom > England (0.46)

Genre: Research Report (1.00)

Industry:

Food & Agriculture > Agriculture (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.52)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)

Add feedback

"Does the cafe entrance look accessible? Where is the door?" Towards Geospatial AI Agents for Visual Inquiries

Froehlich, Jon E., Hwang, Jared, Wang, Zeyu, O'Meara, John S., Su, Xia, Huang, William, Zhang, Yang, Fiannaca, Alex, Nelson, Philip, Kane, Shaun

arXiv.org Artificial IntelligenceAug-22-2025

Interactive digital maps have revolutionized how people travel and learn about the world; however, they rely on preexisting structured data in GIS databases (e.g., road networks, POI indices), limiting their ability to address geo-visual questions related to what the world looks like. W e introduce our vision for Geo-Visual Agents--multimodal AI agents capable of understanding and responding to nuanced visual-spatial inquiries about the world by analyzing large-scale repositories of geospatial images, including streetscapes (e.g., Google Street View), place-based photos (e.g., TripAdvisor, Y elp), and aerial imagery (e.g., satellite photos) combined with traditional GIS data sources. W e define our vision, describe sensing and interaction approaches, provide three exemplars, and enumerate key challenges and opportunities for future work.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2508.15752

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Industry:

Government > Regional Government > North America Government > United States Government (0.69)
Transportation > Ground > Road (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.61)

Add feedback

RareSpot: Spotting Small and Rare Wildlife in Aerial Imagery with Multi-Scale Consistency and Context-Aware Augmentation

Zhang, Bowen, Boulerice, Jesse T., Kuniyil, Nikhil, Mendiratta, Charvi, Kumar, Satish, Shamon, Hila, Manjunath, B. S.

arXiv.org Artificial IntelligenceJun-25-2025

Automated detection of small and rare wildlife in aerial imagery is crucial for effective conservation, yet remains a significant technical challenge. Prairie dogs exemplify this issue: their ecological importance as keystone species contrasts sharply with their elusive presence--marked by small size, sparse distribution, and subtle visual features--which undermines existing detection approaches. To address these challenges, we propose RareSpot, a robust detection framework integrating multi-scale consistency learning and context-aware augmentation. Our multi-scale consistency approach leverages structured alignment across feature pyramids, enhancing fine-grained object representation and mitigating scale-related feature loss. Complementarily, context-aware augmentation strategically synthesizes challenging training instances by embedding difficult-to-detect samples into realistic environmental contexts, significantly boosting model precision and recall. Evaluated on an expert-annotated prairie dog drone imagery benchmark, our method achieves state-of-the-art performance, improving detection accuracy by over 35% compared to baseline methods. Importantly, it generalizes effectively across additional wildlife datasets, demonstrating broad applicability. The RareSpot benchmark and approach not only support critical ecological monitoring but also establish a new foundation for detecting small, rare species in complex aerial scenes.

artificial intelligence, detection, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2506.19087

Country: North America > United States > California > Santa Barbara County > Santa Barbara (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.87)

Add feedback

HouseTS: A Large-Scale, Multimodal Spatiotemporal U.S. Housing Dataset

Wang, Shengkun, Sun, Yanshen, Chen, Fanglan, Wang, Linhan, Ramakrishnan, Naren, Lu, Chang-Tien, Chen, Yinlin

arXiv.org Artificial IntelligenceJun-3-2025

Accurate house-price forecasting is essential for investors, planners, and researchers. However, reproducible benchmarks with sufficient spatiotemporal depth and contextual richness for long horizon prediction remain scarce. To address this, we introduce HouseTS a large scale, multimodal dataset covering monthly house prices from March 2012 to December 2023 across 6,000 ZIP codes in 30 major U.S. metropolitan areas. The dataset includes over 890K records, enriched with points of Interest (POI), socioeconomic indicators, and detailed real estate metrics. To establish standardized performance baselines, we evaluate 14 models, spanning classical statistical approaches, deep neural networks (DNNs), and pretrained time-series foundation models. We further demonstrate the value of HouseTS in a multimodal case study, where a vision language model extracts structured textual descriptions of geographic change from time stamped satellite imagery. This enables interpretable, grounded insights into urban evolution. HouseTS is hosted on Kaggle, while all preprocessing pipelines, benchmark code, and documentation are openly maintained on GitHub to ensure full reproducibility and easy adoption.

housets, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.00765

Country: North America > United States > Virginia (0.47)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Trading (1.00)
Banking & Finance > Real Estate (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evaluating Global Geo-alignment for Precision Learned Autonomous Vehicle Localization using Aerial Data

Yang, Yi, Zhao, Xuran, Zhao, H. Charles, Yuan, Shumin, Bateman, Samuel M., Huang, Tiffany A., Beall, Chris, Maddern, Will

arXiv.org Artificial IntelligenceMar-18-2025

Recently there has been growing interest in the use of aerial and satellite map data for autonomous vehicles, primarily due to its potential for significant cost reduction and enhanced scalability. Despite the advantages, aerial data also comes with challenges such as a sensor-modality gap and a viewpoint difference gap. Learned localization methods have shown promise for overcoming these challenges to provide precise metric localization for autonomous vehicles. Most learned localization methods rely on coarsely aligned ground truth, or implicit consistency-based methods to learn the localization task -- however, in this paper we find that improving the alignment between aerial data and autonomous vehicle sensor data at training time is critical to the performance of a learning-based localization system. We compare two data alignment methods using a factor graph framework and, using these methods, we then evaluate the effects of closely aligned ground truth on learned localization accuracy through ablation studies. Finally, we evaluate a learned localization system using the data alignment methods on a comprehensive (1600km) autonomous vehicle dataset and demonstrate localization error below 0.3m and 0.5$^{\circ}$ sufficient for autonomous vehicle applications.

artificial intelligence, imagery, localization, (14 more...)

arXiv.org Artificial Intelligence

2503.13896

Country:

Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(2 more...)

Genre: Research Report (0.66)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Food & Agriculture > Agriculture (0.68)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

Add feedback

GeoJEPA: Towards Eliminating Augmentation- and Sampling Bias in Multimodal Geospatial Learning

Lundqvist, Theodor, Delvret, Ludvig

arXiv.org Artificial IntelligenceFeb-25-2025

Existing methods for self-supervised representation learning of geospatial regions and map entities rely extensively on the design of pretext tasks, often involving augmentations or heuristic sampling of positive and negative pairs based on spatial proximity. This reliance introduces biases and limits the representations' expressiveness and generalisability. Consequently, the literature has expressed a pressing need to explore different methods for modelling geospatial data. To address the key difficulties of such methods, namely multimodality, heterogeneity, and the choice of pretext tasks, we present GeoJEPA, a versatile multimodal fusion model for geospatial data built on the self-supervised Joint-Embedding Predictive Architecture. With GeoJEPA, we aim to eliminate the widely accepted augmentation- and sampling biases found in self-supervised geospatial representation learning. GeoJEPA uses self-supervised pretraining on a large dataset of OpenStreetMap attributes, geometries and aerial images. The results are multimodal semantic representations of urban regions and map entities that we evaluate both quantitatively and qualitatively. Through this work, we uncover several key insights into JEPA's ability to handle multimodal data.

accessed, learning, representation, (13 more...)

arXiv.org Artificial Intelligence

2503.05774

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > Switzerland (0.04)
North America > United States > Virginia (0.04)
(15 more...)

Genre:

Research Report (1.00)
Overview (0.67)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
(4 more...)

Add feedback

Zero-shot Shark Tracking and Biometrics from Aerial Imagery

Lalgudi, Chinmay K, Leone, Mark E, Clark, Jaden V, Madrigal-Mora, Sergio, Espinoza, Mario

arXiv.org Artificial IntelligenceJan-10-2025

The recent widespread adoption of drones for studying marine animals provides opportunities for deriving biological information from aerial imagery. The large scale of imagery data acquired from drones is well suited for machine learning (ML) analysis. Development of ML models for analyzing marine animal aerial imagery has followed the classical paradigm of training, testing, and deploying a new model for each dataset, requiring significant time, human effort, and ML expertise. We introduce Frame Level ALIgment and tRacking (FLAIR), which leverages the video understanding of Segment Anything Model 2 (SAM2) and the vision-language capabilities of Contrastive Language-Image Pre-training (CLIP). FLAIR takes a drone video as input and outputs segmentation masks of the species of interest across the video. Notably, FLAIR leverages a zero-shot approach, eliminating the need for labeled data, training a new model, or fine-tuning an existing model to generalize to other species. With a dataset of 18,000 drone images of Pacific nurse sharks, we trained state-of-the-art object detection models to compare against FLAIR. We show that FLAIR massively outperforms these object detectors and performs competitively against two human-in-the-loop methods for prompting SAM2, achieving a Dice score of 0.81. FLAIR readily generalizes to other shark species without additional human effort and can be combined with novel heuristics to automatically extract relevant information including length and tailbeat frequency. FLAIR has significant potential to accelerate aerial imagery analysis workflows, requiring markedly less human effort and expertise than traditional machine learning workflows, while achieving superior accuracy. By reducing the effort required for aerial imagery analysis, FLAIR allows scientists to spend more time interpreting results and deriving insights about marine ecosystems.

artificial intelligence, machine learning, shark, (16 more...)

arXiv.org Artificial Intelligence

2501.05717

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
Pacific Ocean (0.04)
Oceania > New Zealand > North Island > Wellington Region > Wellington (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Information Technology (0.68)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments

Elnoor, Mohamed, Weerakoon, Kasun, Seneviratne, Gershom, Xian, Ruiqi, Guan, Tianrui, Jaffar, Mohamed Khalid M, Rajagopal, Vignesh, Manocha, Dinesh

arXiv.org Artificial IntelligenceSep-30-2024

We present a novel autonomous robot navigation algorithm for outdoor environments that is capable of handling diverse terrain traversability conditions. Our approach, VLM-GroNav, uses vision-language models (VLMs) and integrates them with physical grounding that is used to assess intrinsic terrain properties such as deformability and slipperiness. We use proprioceptive-based sensing, which provides direct measurements of these physical properties, and enhances the overall semantic understanding of the terrains. Our formulation uses in-context learning to ground the VLM's semantic understanding with proprioceptive data to allow dynamic updates of traversability estimates based on the robot's real-time physical interactions with the environment. We use the updated traversability estimations to inform both the local and global planners for real-time trajectory replanning. We validate our method on a legged robot (Ghost Vision 60) and a wheeled robot (Clearpath Husky), in diverse real-world outdoor environments with different deformable and slippery terrains. In practice, we observe significant improvements over state-of-the-art methods by up to 50% increase in navigation success rate.

navigation, robot, terrain, (15 more...)

arXiv.org Artificial Intelligence

2409.20445

Country: North America > United States > Maryland > Prince George's County > College Park (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry: Energy (0.93)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)

Add feedback